1. INTRODUCTION

Agriculture is essential to human subsistence. Continuous population growth strains the capacity to
feed everyone, so resources and food production must be managed carefully to provide for the growing
population. Agricultural production relies on many factors, such as soil type and quality, irrigation
management, weather, and water. Farming has become more intensive to maximize crop yields. Producing a
sufficient amount of food requires smart agriculture. Satellite data makes agriculture more accurate and
predictive. Smart farming has evolved widely in the last few years to meet the demand for food.

Machine learning (ML), together with data analysis, makes it possible to understand and explore the
field of agriculture more effectively. According to Tom Mitchell, a computer program learns from
experience with respect to a task and a performance measure if its performance at the task, as measured
by that performance measure, improves with experience [1]. Samuel defines ML as a field of study that
gives machines the ability to learn without being explicitly programmed [2]. Over time, machine learning
has been widely applied in many fields, including bioinformatics [3], anatomy [4], cheminformatics [5],
economics [6], robot locomotion [7], speech recognition [8], information retrieval [9], and
neuroscience [10]. This paper discusses machine learning algorithms in the agriculture domain [11].

The paper is organized as follows: the machine learning approach section describes machine learning
methods, techniques, and algorithms; the literature review section reviews the identified areas of
agriculture that have used machine learning; and the discussion and conclusion sections present the final
findings, along with the advantages of applying machine learning in the agriculture domain.

ML is a process in which a system or machine learns from experience and improves its performance. The
improvement can be measured with statistical and mathematical models. An ML model or algorithm is trained
on a set of examples, that is, a data set. Once training is complete, the trained model is used to
identify, predict, or classify new input data. Figure 1 illustrates the ML approach. The ML algorithms
explained below are not limited to the methods applied in the papers used for this review.


[Figure 1 flow: Training Data -> Machine Learning Algorithms -> Trained Model; New Input Data -> Trained Model -> Result]

Figure 1. Machine learning approach
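
The flow in Figure 1 can be sketched in a few lines of Python. This is only an illustration under stated
assumptions: the nearest-centroid classifier, the two features (soil moisture and temperature), and the
"low"/"high" yield labels are hypothetical choices made here and do not come from any of the reviewed papers.

```python
from statistics import mean


def train(samples, labels):
    """Training step: compute one centroid (mean feature vector) per class."""
    rows_by_label = {}
    for features, label in zip(samples, labels):
        rows_by_label.setdefault(label, []).append(features)
    return {
        label: [mean(column) for column in zip(*rows)]
        for label, rows in rows_by_label.items()
    }


def predict(model, features):
    """Prediction step: return the label whose centroid is closest to the input."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))

    return min(model, key=lambda label: squared_distance(model[label]))


# Hypothetical training data: (soil moisture in %, temperature in deg C) -> yield class.
training_data = [(10.0, 35.0), (12.0, 33.0), (30.0, 24.0), (28.0, 26.0)]
training_labels = ["low", "low", "high", "high"]

trained_model = train(training_data, training_labels)   # Training Data -> Trained Model
result = predict(trained_model, (27.0, 25.0))           # New Input Data -> Result
print(result)
```

The same two-step shape (fit on labelled examples, then apply the fitted model to unseen inputs) holds for
the more capable algorithms discussed in the review; only the contents of `train` and `predict` change.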


2. LITERATURE REVIEW 
2.1. Research method 

This paper follows a systematic review methodology. The review process comprises review planning, the
search string, and the search criteria for machine learning in agriculture. After the search is
completed, papers are selected based on inclusion and exclusion criteria. This section describes how the
review was carried out.


2.2. Planning of review 

Machine learning has evolved rapidly in agriculture in recent years. However, despite numerous research
studies, its potential has not yet been established for every area of the field. This review aims to
provide an outline and an in-depth investigation of machine learning technology in the agriculture
domain. The work analyses various sub-categories of the agriculture domain, the techniques applied, the
features observed, and the dataset resources used in the research.


2.3. Search string 

To construct the search string, the following keywords were identified: agriculture machine learning, ML
techniques agriculture, crop yield prediction machine learning, pest machine learning, crop disease
machine learning, soil machine learning, and weed machine learning, with the main emphasis on the
keywords machine learning and agriculture. The authors performed an in-depth search to ensure the
comprehensiveness of the study. A few known papers may not have been considered because their titles did
not match the identified keywords. Figure 2 represents the chosen search strings.


2.4. Selection criteria

The literature review follows pre-specified selection criteria for including and excluding papers. The
inclusion criterion admits papers that match the search string; the exclusion criteria remove papers
because of a title or domain mismatch, or an irrelevant abstract or text. Figure 3 illustrates the paper
inclusion and exclusion.

Figure 2. Search string

Figure 3. Inclusion and exclusion criteria
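
The title-matching step of the screening described in Sections 2.3 and 2.4 can be sketched as follows.
The search strings are those listed in Section 2.3; everything else is an assumption made for
illustration, including the word-level matching rule, the `abstract_relevant` flag standing in for manual
abstract and text screening, and the paper titles, which are hypothetical. It is not the authors' actual
procedure.

```python
import re

# Search strings listed in Section 2.3.
SEARCH_STRINGS = [
    "agriculture machine learning",
    "ML techniques agriculture",
    "crop yield prediction machine learning",
    "pest machine learning",
    "crop disease machine learning",
    "soil machine learning",
    "weed machine learning",
]


def words(text):
    """Lower-case a string and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def include_paper(title, abstract_relevant):
    """Include a paper when its title contains every word of at least one
    search string and the abstract/text was judged relevant (assumed flag)."""
    title_words = words(title)
    title_matches = any(words(s) <= title_words for s in SEARCH_STRINGS)
    return title_matches and abstract_relevant


# Hypothetical candidate papers, for illustration only.
papers = [
    {"title": "Crop yield prediction with machine learning", "abstract_relevant": True},
    {"title": "Machine learning for weed detection in rice fields", "abstract_relevant": True},
    {"title": "Deep learning for speech recognition", "abstract_relevant": True},
    {"title": "Soil moisture estimation using machine learning", "abstract_relevant": False},
]

selected = [p["title"] for p in papers if include_paper(p["title"], p["abstract_relevant"])]
print(selected)
```

A strict title match like this also reproduces the limitation noted in Section 2.3: a relevant paper whose
title does not contain the keywords is silently excluded.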


2.5. Review conduction 

Machine learning is a game-changing technology that is widely used in diverse fields. It has been
applied in the agricultural domain throughout the crop cycle, from soil management at the start to
robots deciding on crop ripeness at the end. In this review, articles are classified into the following
categories: crop yield prediction, soil management, pest management, weed management, and crop disease.
Papers were searched using specific keywords for each selected agriculture domain. Agriculture has many
sub-areas, and not all of them can be included in the review; because of this constraint, some areas are
excluded. General abbreviations used in the paper are compiled in Table 1.


2.6. Categorical literature review 
2.6.1. Crop yield prediction 

In agriculture, crop yield, also known as agricultural output, is essential to meeting the needs of the
growing population. Crop yield or productivity depends on many factors, such as weather conditions, soil
conditions, water, temperature, and rainfall. By predicting yield from these factors, ML can help match
the demand and supply of food without harming the environment or natural resources.


2.6.2. Soil management 

Machine learning has been used to predict and identify soil characteristics, such as soil moisture,
condition, and temperature. Better prediction of soil condition can help improve soil management. ML
technologies can estimate soil properties more accurately, in less time and at lower cost.

Table 1. General abbreviations

Abbreviation   Definition
AMSR-E         Advanced Microwave Scanning Radiometer on the Earth Observing System
ANN            Artificial Neural Network
AI             Artificial Intelligence
CNN            Convolutional Neural Network
CP-ANN         Counter Propagation Artificial Neural Network
DL             Deep Learning
DT             Decision Tree
EL             Ensemble Learning
ELM            Extreme Learning Machine
EM             Expectation Maximisation
ERT            Extremely randomized tree
LR             Logistic Regression
LS-SVM         Least Squares Support Vector Machines
LSTM           Long short-term memory
ML             Machine Learning
MLR            Multiple Linear Regression
MODIS          Moderate Resolution Imaging Spectroradiometer
MPE            Mean percent error
NB             Naive Bayes
NDVI           Normalized difference vegetation index
NN             Neural Network
PCA            Principal Component Analysis
PLSDA          Partial Least Squares Discriminant Analysis
PMNN           Perceptron Multilayer neural network
RBF-NN         Radial Basis Function Neural Network
RE             Relative Error
RF             Random Forest
RMSE           Root mean square error
SOM            Self-Organizing Map
SVM            Support Vector Machine
SVR            Support Vector Regression
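
Three of the abbreviations in Table 1 (RMSE, MPE, and RE) are error metrics that recur in the results of
the reviewed papers. The sketch below shows one common way to compute them. The exact definitions vary
between papers, so the formulas here (in particular the sign convention of MPE and expressing RE as a
percentage) are assumptions, and the yield values are hypothetical.

```python
import math


def rmse(actual, predicted):
    """Root mean square error: sqrt(mean((actual - predicted)^2))."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mean_percent_error(actual, predicted):
    """Mean percent error (signed), assumed as mean((actual - predicted) / actual) * 100."""
    ratios = [(a - p) / a for a, p in zip(actual, predicted)]
    return sum(ratios) / len(ratios) * 100.0


def relative_error(actual, predicted):
    """Relative error of one prediction, assumed as |predicted - actual| / |actual| * 100."""
    return abs(predicted - actual) / abs(actual) * 100.0


# Hypothetical observed and predicted yields (for example, in kg/ha).
actual_yield = [100.0, 200.0, 400.0]
predicted_yield = [110.0, 190.0, 400.0]

print(rmse(actual_yield, predicted_yield))
print(mean_percent_error(actual_yield, predicted_yield))
print(relative_error(actual_yield[0], predicted_yield[0]))
```

Because MPE is signed, over- and under-predictions can cancel out, whereas RMSE penalises both; this is
one reason the two metrics are not directly comparable across studies.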


2.6.3. Pest management 

Pests damage crops and reduce production, which can severely affect the food supply and demand chain.
To reduce crop damage and increase production, farmers are compelled to use chemicals to control pests
and protect their fields. Because these chemicals are harmful to the environment and to animal and human
health, ML algorithms can provide an efficient alternative for pest management.


2.6.4. Weed management 

Weeds are the most undesirable plants in farming because they compete with the crop. They make
harvesting difficult and add impurities and moisture to the crop. The negative effects of weeds on yield
include competition for sunlight, water, and space, more complex harvesting, and reduced crop quality. ML
can detect weeds among crops. Many of the articles reviewed here detect weeds and discriminate them from
the crop.


2.6.5. Crop disease 

The rapidly increasing world population puts considerable pressure on agricultural resources. Crop
production is essential to meeting the population's needs as well as to sustaining the economic system.
Crop diseases are a primary source of plant damage and reduce crop production. Because of adverse climate
and environmental conditions, the incidence of plant diseases is on the rise. There are numerous crop
diseases with various symptoms, including spots or smudges appearing on plant leaves [12]. ML techniques
have been adapted to detect plant disease at an early stage. Table 2 in the appendix compares the
above-mentioned categories.


3. DISCUSSION 

The review's primary focus is to outline the significant benefits of ML in the agriculture domain and
possible research areas. The review analyses the existing machine learning tools and techniques deployed
in the agriculture domain, including crop prediction, soil management, pest management, weed management,
and crop disease. Many international journals cover advances in the development and application of
hardware, software, and related technologies for solving issues in the agriculture domain. The total
number of research articles reviewed is 38: 3 conference and 35 journal articles, as shown in Figure 4.
The articles date from 2005 to the present; their year-wise distribution is shown in Figure 5. The result
clearly shows that significant work has been done in agriculture using machine learning in the last 3 to
4 years.

Analysis of the articles indicates that mainly nine ML algorithms are examined or adopted in the
surveyed work, as shown in Figure 6. In crop prediction, nine ML algorithms are deployed, and ANN is the
most popular. In soil management, five ML algorithms are deployed, with SVM and regression mainly used.
In pest management, five ML algorithms are deployed, with SVM used most. In weed management, five ML
algorithms are implemented, and SVM is most often used. Finally, in crop disease, four ML algorithms are
implemented, and SVM is again used most. It can thus be concluded from the reviewed literature that the
majority of the work uses ANN and SVM.

Figure 6. Utilization of ML algorithms in different categories


The figures indicate that SVM is the most widely implemented algorithm because of its sequential
approach, which combines several features to reach a decision, that is, to assign feature vectors to
classes. SVM uses a kernel function to handle data that are not linearly separable: the kernel maps the
input vector into a high-dimensional space in which a hyperplane can separate the classes. SVM is also
preferred because of its sparse representation and the absence of local minima. Machine learning has a
significant impact on application areas of the agriculture domain, and the results produced by ML are
promising. DL in particular is gaining acceptance in the agriculture sector because of its automatic
feature extraction, which can ease the process and support the stakeholders of the agriculture domain. DL
architectures and algorithms are also widely implemented in the crop disease, weed management, and crop
prediction domains.
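
The role of the kernel function mentioned above can be made concrete with a small sketch. It implements
the linear, polynomial, and RBF kernels (the three kernel types compared for reference [30] in Table 2) and
checks numerically that a degree-2 polynomial kernel equals an ordinary dot product after an explicit
mapping into a six-dimensional space. The parameter values (`degree`, `coef0`, `gamma`) and the sample
vectors are arbitrary assumptions for illustration, not settings taken from the reviewed papers.

```python
import math


def dot(x, y):
    """Ordinary dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    return dot(x, y)


def polynomial_kernel(x, y, degree=2, coef0=1.0):
    return (dot(x, y) + coef0) ** degree


def rbf_kernel(x, y, gamma=0.5):
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def quadratic_feature_map(x):
    """Explicit map of a 2-D vector into 6-D space; its dot product equals
    the degree-2 polynomial kernel with coef0 = 1."""
    x1, x2 = x
    r2 = math.sqrt(2.0)
    return (1.0, r2 * x1, r2 * x2, x1 * x1, x2 * x2, r2 * x1 * x2)


x, y = (1.0, 2.0), (3.0, 4.0)
kernel_value = polynomial_kernel(x, y)
mapped_value = dot(quadratic_feature_map(x), quadratic_feature_map(y))
print(linear_kernel(x, y), kernel_value, mapped_value, rbf_kernel(x, y))
```

The kernel gives the same value as the dot product in the higher-dimensional space without ever building
the mapped vectors, which is what lets SVM separate data that are not linearly separable in the original
feature space at modest cost.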


4. CONCLUSION 

ML-based techniques have attracted much attention from researchers seeking to improve productivity in
the agriculture domain. This review summarises the implementation of ML algorithms in the agriculture
domain over the past few years. Though many algorithms are deployed, SVM and neural networks stand out as
the key techniques for better and more precise results. However, researchers can explore new techniques,
new domains, and the inclusion of raw data to obtain more accurate results in the future. Deep learning
has been gaining attention in the past 3-4 years. The review covers five major domains; however, further
study is required to explore other research areas of agriculture: rain management, weather management,
climate management, livestock production, and animal welfare.


APPENDIX 
Table 2. Comparison among multiple agriculture domains

[13] Crop prediction
    Observed features: Seven-band reflectance imagery
    Functionality: Remote sensing data used to train the model to predict the crop yield of one region.
        Then, another region prediction was performed using transfer learning
    Applied algorithms: LSTM, regression
    Data sources: Moderate resolution imaging spectroradiometer satellite imagery
    Results: Trained and tested the model on soybean data of Argentina and predicts fine for Brazil.
        Pre-trained model

[14] Crop prediction
    Observed features: Soil moisture
    Functionality: Estimates crop yield and presents a comparison among many machine learning techniques
    Applied algorithms: SVM, ERT, RF, DL
    Data sources: National Agricultural Statistics Service and United States Department of Agriculture,
        National Aeronautics and Space Administration, European Space Agency, Climate Change Initiative
        and PRISM Climate Group
    Results: DL produced the highest accuracy among all

[15] Crop prediction
    Observed features: Multiple features: color, shape, texture and size
    Functionality: Detects each integral tomato fruit which incorporates mature, immature, and young
        fruits on a tomato plant
    Applied algorithms: X-Means, DT
    Data sources: 154 images were collected by conventional RGB digital camera at Tsukuba Plant Factory
        of the Institute of Vegetable and Tea Science, Ibaraki, Japan
    Results: Recall value: 0.80; Precision: 0.88; Recall of young fruit: 0.78

[16] Crop prediction
    Observed features: Multilayer soil parameters
    Functionality: Predicts wheat yield for three isofrequency classes, namely high, medium and low
    Applied algorithms: CP-ANN, XY-Fusion, Supervised Kohonen Network
    Data sources: Duck End Farm Field, Wilstead, Bedfordshire, U.K.
    Results: Accuracy: Supervised Kohonen network: 81.65%; CP-ANN: 78.3%; XY-Fusion: 80.92%

[17] Crop prediction
    Observed features: Geometrical features
    Functionality: Detects tomatoes from RGB images
    Applied algorithms: K-Means, SOM, EM
    Data sources: RGB images of spatial resolution acquired from unmanned aerial vehicles
    Results: K-means: Precision: 0.723, Recall: 0.593, F-Measure: 0.652
        SOM: Precision: 0.730, Recall: 0.686, F-Measure: 0.707
        EM: Precision: 0.919, Recall: 0.606, F-Measure: 0.730

[18] Crop prediction
    Observed features: Soil properties
    Functionality: SBOCM used to predict different stages and yield of rice
    Applied algorithms: SVM
    Data sources: Chinese Academy of Sciences
    Results: Middle-season rice: Tillering stage: RE(%)=22.1; Heading stage: RE(%)=17.1; Milk stage: RE(%)=19.2
        Early rice: Tillering stage: RE(%)=20.5; Heading stage: RE(%)=15.8; Milk stage: RE(%)=8.5
        Late rice: Tillering stage: RE(%)=21.0; Heading stage: RE(%)=16.5; Milk stage: RE(%)=11.1

[19] Crop prediction
    Observed features: Irrigation water, rainfall, temperature
    Functionality: Crop yield prediction performed for two consecutive years
    Applied algorithms: MLR, M5-Prime Regression Trees, PMNN, SVR, K-NN
    Data sources: Irrigation module of Santa Rosa [Agricultural Production Data and Weather information Data]
    Results: M5-Prime predicted with the best accuracy, followed by KNN, SVR and MLR.

[20] Crop prediction
    Observed features: Vegetation indices
    Functionality: Determines the potential of hyperspectral data and ANNs
    Applied algorithms: ANN
    Data sources: Emile A. Lods Agronomy Research Centre data obtained by Compact Airborne
        Spectrographic Imager
    Results: RMSE (kg/ha) = 19.7

[21] Soil management
    Observed features: N/A
    Functionality: Predict soil texture and stoniness based on γ-spectroscopy
    Applied algorithms: SVM, ANN
    Data sources: Tuscany, Central Italy
    Results: RMSE: SVM: Sand: 7.0; Clay: 5.9; Stoniness: 0.10
        ANN: Sand: 7.9; Clay: 6.3; Stoniness: 0.11

[22] Soil management
    Observed features: N/A
    Functionality: Crop yield prediction based on soil salinity
    Applied algorithms: Stepwise linear regression
    Data sources: Lower Seyhan plain, Berdan, Seyhan, and Ceyhan rivers
    Results: MPE: Wheat: 7.9%; Corn: 8.8%; Cotton: 6.3%
        Crop yield loss: Corn: 55%; Wheat: 28%; Cotton: 15%

[23] Soil management
    Observed features: N/A
    Functionality: AMSR-E data is consistently used to observe patterns of global soil moisture
    Applied algorithms: RF
    Data sources: Global change master directory and tural development administration
    Results: Correlation coefficient (r): South Korea: 0.71; Australia: 0.84
        RMSE: South Korea: 0.049; Australia: 0.05

[24] Soil management
    Observed features: N/A
    Functionality: Uses near-infrared and visible bands to predict soil nitrogen, organic carbon, and moisture
    Applied algorithms: LS-SVM, Cubist
    Data sources: Top soil layer from Premslin, Germany
    Results: RMSE of prediction: LS-SVM: Moisture content: 0.457%; Organic carbon: 0.062%

[25] Soil management
    Observed features: N/A
    Functionality: Predicts soil liquefaction susceptibility
    Applied algorithms: SVM
    Data sources: Chi-Chi, Taiwan earthquake
    Results: Performance: 77.65%

[26] Soil management
    Observed features: N/A
    Functionality: Implemented digital soil mapping techniques to estimate the spatial distribution of
        numerous soil properties
    Applied algorithms: Cubist, RF
    Data sources: Borujen region, Chaharmahal-Va-Bakhtiari Province, central Iran
    Results: Soil organic carbon: RMSE: 0.33 (RF); Calcium carbonate equivalent: RMSE: 9.52 (Cubist);
        Clay: RMSE: 7.86 (RF)

[27] Soil management
    Observed features: N/A
    Functionality: Defines and assesses the efficiency of transfer learning to localize
    Applied algorithms: CNN
    Data sources: LUCAS Soil database
    Results: RMSE: Organic carbon: 10.5%; Cation exchange capacity: 11.8%; Clay content: 12.0%; pH: 11.5%

[28] Pest management
    Observed features: Color, shape, texture
    Functionality: Automated rice pest identification system
    Applied algorithms: SVM
    Data sources: Live images with cameras
    Results: Accuracy 97.5%

[29] Pest management
    Observed features: Area, perimeter, sphericity, eccentricity
    Functionality: Detect individual pest among other species
    Applied algorithms: ANN
    Data sources: Sugar beet field in Shiraz, Iran
    Results: R=0.89

[30] Pest management
    Observed features: Curve response and slope
    Functionality: Diagnosis of plant pest using electronic nose
    Applied algorithms: SVM
    Data sources: Lancaster University, UK
    Results: Tomato (mildew): Linear: 95%; Polynomial: 94%; RBF: 96%
        Cucumber (wounded): Linear: 77%; Polynomial: 82%; RBF: 87%
        Cucumber (spider mite): Linear: 94%; Polynomial: 88%; RBF: 91%
        Pepper (wounded): Linear: 67%; Polynomial: 71%; RBF: 92%

[31] Pest management
    Observed features: 58 attributes
    Functionality: Develop a method to forecast the result of pest monitoring
    Applied algorithms: AdaBoost, NB
    Data sources: Zespri International Ltd
    Results: Precision: AdaBoost: 98%; Naive Bayes: 95%

[32] Pest management
    Observed features: N/A
    Functionality: Detects and classifies multi-class pests
    Applied algorithms: DL
    Data sources: 88,670 images
    Results: Mean average precision: 75.46%

[33] Pest management
    Observed features: Color indexes were: Hue, Saturation and Intensity
    Functionality: Automatically detects thrips and their position
    Applied algorithms: SVM
    Data sources: Tarbiat Modares University, Islamic Republic of Iran, Tehran
    Results: MPE of less than 2.25%

[34] Weed management
    Observed features: Color, shape, texture and image orientation
    Functionality: Pynovisao software developed and used to detect weed in crop image and classified using CNN
    Applied algorithms: CNN
    Data sources: Images captured by unmanned aerial vehicle
    Results: CNN: Precision 0.991; Sensitivity 0.991

[35] Weed management
    Observed features: Nitrogen application rate: 60, 120 and 250 kg N/ha
    Functionality: Weed classification performed w.r.t. nitrogen application rate
    Applied algorithms: SVM
    Data sources: 72-waveband compact airborne spectrographic imager (CASI), range: 408.73 to 947.07 nm
    Results: Effect of nitrogen and weed combined: 69.2%; Effect of nitrogen: 80.8; Effect of weed: 85.8

[36] Weed management
    Observed features: Color and texture
    Functionality: Weed discrimination for different growing states of rice
    Applied algorithms: DT
    Data sources: Rice and weed images of 1125*1500 from the internet
    Results: Precision: 0.982; Recall: 0.977

[37] Weed management
    Observed features: Spectral
    Functionality: Recognizes weed species based on hyperspectral sensing
    Applied algorithms: SOM, Mixture of Gaussian
    Data sources: Hyperspectral images using HSI
    Results: Mixture of Gaussian: 31%-98%; SOM: 53%-94%

[38] Weed management
    Observed features: Color, moment invariant, size
    Functionality: Weed and crop were classified using digital images
    Applied algorithms: SVM
    Data sources: OLYMPUS FE4000 point-and-shoot digital camera
    Results: Accuracy: 97%

[39] Weed management
    Observed features: Shape, Fourier descriptor, moment invariant
    Functionality: Weed detection using shape features
    Applied algorithms: SVM, ANN
    Data sources: 960x1280 pixels, Shiraz University
    Results: Accuracy: ANN: 92.92%; SVM: 95.00%

[40] Weed management
    Observed features: RGB-NIR imagery
    Functionality: Detect sugar beet plant and weed based on vision classification
    Applied algorithms: CNN
    Data sources: UAVs equipped with vision sensors
    Results: Accuracy 95%

[41] Weed management
    Observed features: Size, length, and Fourier
    Functionality: Classification for small-grain weed species concerning Cirsium arvense and Galium aparine
    Applied algorithms: SVM
    Data sources: Red (580 nm) and infrared (>720 nm) spectrum
    Results: Overall accuracy: 97.7%

[42] Crop disease
    Observed features: Hyperspectral imaging with 2.8 nm spectral resolution, pixel size is 6.45x6.45 µm
    Functionality: Detecting Sclerotinia sclerotiorum on oilseed rape stems
    Applied algorithms: PLSDA, RBF-NN, SVM, and ELM
    Data sources: Farm of Zhejiang University
    Results: Sample set 1:
            Average spectra: PLSDA: 100; RBF-NN: 97.50; ELM: 100; SVM: 92.50
            Pixel-wise spectra: PLSDA: 94.80; RBF-NN: 98.80; ELM: 99.40; SVM: 99.00
        Sample set 2:
            Average spectra: PLSDA: 92.50; RBF-NN: 87.50; ELM: 97.50; SVM: 90.00
            Pixel-wise spectra: PLSDA: 96.60; RBF-NN: 98.70; ELM: 99.50; SVM: 99.30

[43] Crop disease
    Observed features: Leaf, stem, and fruits
    Functionality: Detect real-time disease along with the class and location of the plant
    Applied algorithms: DL
    Data sources: Images using a digital camera from farms of the Korean peninsula
    Results: Mean average precision 83.06%

[44] Crop disease
    Observed features: Spectral vegetation indices
    Functionality: Detects and classifies plant diseases in sugar beet
    Applied algorithms: SVM
    Data sources: Cercospora leaf spot, leaf rust and powdery mildew
    Results: Cercospora leaf spot: 89.69; Sugar beet rust: 83.60; Powdery mildew: 92.46

[45] Crop disease
    Observed features: Coloured, greyscale and segmented
    Functionality: Detects plant disease using images
    Applied algorithms: CNN
    Data sources: PlantVillage public dataset
    Results: Overall accuracy: 99.35%

[46] Crop disease
    Observed features: 75 features by wavelet decomposition
    Functionality: Healthy and fusarium diseased pepper leaves were detected
    Applied algorithms: KNN
    Data sources: GAP Agricultural research (GAPTEAM), Sanliurfa, Turkey
    Results: KNN: Statistics of wavelet coefficient: 99%; Wavelet coefficient: 100%

[47] Crop disease
    Observed features: Grayscale
    Functionality: Detect and classify potato disease by visible symptoms
    Applied algorithms: CNN
    Data sources: Images captured by cameras
    Results: Dataset split: 90%-train and 10%-test provides accuracy 0.9585

[48] Crop disease
    Observed features: Shape, texture, and grey level
    Functionality: Identification of plant disease by visual symptoms
    Applied algorithms: SVM
    Data sources: The University of Georgia, USA
    Results: Accuracy 93.1%

[49] Crop disease
    Observed features: Leaf properties
    Functionality: Classifies the disease based on symptoms visible
    Applied algorithms: CNN
    Data sources: Plant village
    Results: Accuracy 99.18%

[50] Crop disease
    Observed features: Color, texture, gray level co-occurrence matrix, and wavelet transform
    Functionality: Detects disease in apple fruit
    Applied algorithms: ANN, SVR-RBF, and SVR-Poly
    Data sources: ANN, SVR-RBF, and SVR-Poly
    Results: RMSE: ANN: 0.53; SVR-Poly: 0.42; SVR-RBF: 0.2
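
Several entries in Table 2 report precision, recall, and F-measure. The sketch below shows the standard
definitions and recomputes the F-measure from the precision and recall reported for K-means and EM in
reference [17], as a consistency check. The true-positive, false-positive, and false-negative counts in
the second part are hypothetical and serve only to show how the metrics are derived from detections.

```python
def precision(true_positives, false_positives):
    """Fraction of detections that are correct."""
    return true_positives / (true_positives + false_positives)


def recall(true_positives, false_negatives):
    """Fraction of actual objects that were detected."""
    return true_positives / (true_positives + false_negatives)


def f_measure(p, r):
    """Harmonic mean of precision and recall."""
    return 2.0 * p * r / (p + r)


# (precision, recall, reported F-measure) from the Table 2 entry for reference [17].
reported = {
    "K-means": (0.723, 0.593, 0.652),
    "EM": (0.919, 0.606, 0.730),
}

for name, (p, r, reported_f) in reported.items():
    print(name, round(f_measure(p, r), 3), reported_f)

# Hypothetical detection counts, for illustration only.
tp, fp, fn = 8, 2, 4
print(precision(tp, fp), recall(tp, fn), f_measure(precision(tp, fp), recall(tp, fn)))
```

Rounded to three decimals, the recomputed F-measures agree with the values reported for both algorithms.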